Córdoba
Short-Term Regional Electricity Demand Forecasting in Argentina Using LSTM Networks
This study presents the development and optimization of a deep learning model based on Long Short-Term Memory (LSTM) networks to predict short-term hourly electricity demand in Córdoba, Argentina. Integrating historical consumption data with exogenous variables (climatic factors, temporal cycles, and demographic statistics), the model achieved high predictive precision, with a mean absolute percentage error of 3.20\% and a determination coefficient of 0.95. The inclusion of periodic temporal encodings and weather variables proved crucial to capture seasonal patterns and extreme consumption events, enhancing the robustness and generalizability of the model. In addition to the design and hyperparameter optimization of the LSTM architecture, two complementary analyses were carried out: (i) an interpretability study using Random Forest regression to quantify the relative importance of exogenous drivers, and (ii) an evaluation of model performance in predicting the timing of daily demand maxima and minima, achieving exact-hour accuracy in more than two-thirds of the test days and within abs(1) hour in over 90\% of cases. Together, these results highlight both the predictive accuracy and operational relevance of the proposed framework, providing valuable insights for grid operators seeking optimized planning and control strategies under diverse demand scenarios.
Towards culturally-appropriate conversational AI for health in the majority world: An exploratory study with citizens and professionals in Latin America
Peters, Dorian, Espinoza, Fernanda, da Re, Marco, Ivetta, Guido, Benotti, Luciana, Calvo, Rafael A.
There is justifiable interest in leveraging conversational AI (CAI) for health across the majority world, but to be effective, CAI must respond appropriately within cultur ally and linguistically diverse context s . Therefore, we need ways to address the fact that current LLMs exclude many lived experience s globally . Various advances are underway which focus on top - down approaches and increas ing training data . In this paper, we aim to complement these with a bottom - up locally - grounded approach based on qualitative data collected during participatory workshops in Latin America. Our goal is to construct a rich and human - centred understanding o f: a) potential areas of cultural misalignment in digital health; b) regional perspectives on chatbots for health and c) strategies for creating culturally - appropriate CAI; with a focus on the understudied Latin American context . Our findings show that academic boundaries on notions of cultur e lose meaning at the ground level and technologies will need to engage with a broad er framework; one that encapsulates the way economics, politics, geogr aphy and local logistics are entangled in cultural experience. To this end, we introduce a framework for ' Pluriversal Conversational AI for H ealth ' which allows for the possibility that more relationality and tolerance, rather than just more data, may be called for .
Scaling Sequential Recommendation Models with Transformers
Zivic, Pablo, Vazquez, Hernan, Sanchez, Jorge
Modeling user preferences has been mainly addressed by looking at users' interaction history with the different elements available in the system. Tailoring content to individual preferences based on historical data is the main goal of sequential recommendation. The nature of the problem, as well as the good performance observed across various domains, has motivated the use of the transformer architecture, which has proven effective in leveraging increasingly larger amounts of training data when accompanied by an increase in the number of model parameters. This scaling behavior has brought a great deal of attention, as it provides valuable guidance in the design and training of even larger models. Taking inspiration from the scaling laws observed in training large language models, we explore similar principles for sequential recommendation. We use the full Amazon Product Data dataset, which has only been partially explored in other studies, and reveal scaling behaviors similar to those found in language models. Compute-optimal training is possible but requires a careful analysis of the compute-performance trade-offs specific to the application. We also show that performance scaling translates to downstream tasks by fine-tuning larger pre-trained models on smaller task-specific domains. Our approach and findings provide a strategic roadmap for model training and deployment in real high-dimensional preference spaces, facilitating better training and inference efficiency. We hope this paper bridges the gap between the potential of transformers and the intrinsic complexities of high-dimensional sequential recommendation in real-world recommender systems. Code and models can be found at https://github.com/mercadolibre/srt
Space for Improvement: Navigating the Design Space for Federated Learning in Satellite Constellations
Kim, Grace, Powell, Luca, Svoboda, Filip, Lane, Nicholas
Space has emerged as an exciting new application area for machine learning, with several missions equipping deep learning capabilities on-board spacecraft. Pre-processing satellite data through on-board training is necessary to address the satellite downlink deficit, as not enough transmission opportunities are available to match the high rates of data generation. To scale this effort across entire constellations, collaborated training in orbit has been enabled through federated learning (FL). While current explorations of FL in this context have successfully adapted FL algorithms for scenario-specific constraints, these theoretical FL implementations face several limitations that prevent progress towards real-world deployment. To address this gap, we provide a holistic exploration of the FL in space domain on several fronts. 1) We develop a method for space-ification of existing FL algorithms, evaluated on 2) FLySTacK, our novel satellite constellation design and hardware aware testing platform where we perform rigorous algorithm evaluations. Finally we introduce 3) AutoFLSat, a generalized, hierarchical, autonomous FL algorithm for space that provides a 12.5% to 37.5% reduction in model training time than leading alternatives.
MessIRve: A Large-Scale Spanish Information Retrieval Dataset
Valentini, Francisco, Cotik, Viviana, Furman, Damián, Bercovich, Ivan, Altszyler, Edgar, Pérez, Juan Manuel
Information retrieval (IR) is the task of finding relevant documents in response to a user query. Although Spanish is the second most spoken native language, current IR benchmarks lack Spanish data, hindering the development of information access tools for Spanish speakers. We introduce MessIRve, a large-scale Spanish IR dataset with around 730 thousand queries from Google's autocomplete API and relevant documents sourced from Wikipedia. MessIRve's queries reflect diverse Spanish-speaking regions, unlike other datasets that are translated from English or do not consider dialectal variations. The large size of the dataset allows it to cover a wide variety of topics, unlike smaller datasets. We provide a comprehensive description of the dataset, comparisons with existing datasets, and baseline evaluations of prominent IR models. Our contributions aim to advance Spanish IR research and improve information access for Spanish speakers.
Impact of PSF misestimation and galaxy population bias on precision shear measurement using a CNN
Weak gravitational lensing of distant galaxies provides a powerful probe of dark energy. The aim of this study is to investigate the application of convolutional neural networks (CNNs) to precision shear estimation. In particular, using a shallow CNN, we explore the impact of point spread function (PSF) misestimation and `galaxy population bias' (including `distribution bias' and `morphology bias'), focusing on the accuracy requirements of next generation surveys. We simulate a population of noisy disk and elliptical galaxies and adopt a PSF that is representative of a Euclid-like survey. We quantify the accuracy achieved by the CNN assuming a linear relationship between the estimated and true shears and measure the multiplicative ($m$) and additive ($c$) biases. We make use of an unconventional loss function to mitigate the effects of noise bias and measure $m$ and $c$ when we use either: (i) an incorrect galaxy ellipticity distribution or size-magnitude relation, or the wrong ratio of morphological types, to describe the population of galaxies (distribution bias); (ii) an incorrect galaxy light profile (morphology bias); or (iii) a PSF with size or ellipticity offset from its true value (PSF misestimation). We compare our results to the Euclid requirements on the knowledge of the PSF model shape and size. Finally, we outline further work to build on the promising potential of CNNs in precision shear estimation.
Hyperparameter optimization of hp-greedy reduced basis for gravitational wave surrogates
Cerino, Franco, Diaz-Pace, Andrés, Tassone, Emmanuel, Tiglio, Manuel, Villegas, Atuel
In a previous work we introduced, in the context of gravitational wave science, an initial study on an automated domain-decomposition approach for reduced basis through hp-greedy refinement. The approach constructs local reduced bases of lower dimensionality than global ones, with the same or higher accuracy. These ``light'' local bases should imply both faster evaluations when predicting new waveforms and faster data analysis, in particular faster statistical inference (the forward and inverse problems, respectively). In this approach, however, we have previously found important dependence on several hyperparameters, which do not appear in global reduced basis. This naturally leads to the problem of hyperparameter optimization (HPO), which is the subject of this paper. We tackle the problem through a Bayesian optimization, and show its superiority when compared to grid or random searches. We find that for gravitational waves from the collision of two spinning but non-precessing black holes, for the same accuracy, local hp-greedy reduced bases with HPO have a lower dimensionality of up to $4 \times$ for the cases here studied, depending on the desired accuracy. This factor should directly translate in a parameter estimation speedup, for instance. Such acceleration might help in the near real-time requirements for electromagnetic counterparts of gravitational waves from compact binary coalescences. In addition, we find that the Bayesian approach used in this paper for HPO is two orders of magnitude faster than, for example, a grid search, with about a $100 \times$ acceleration. The code developed for this project is available as open source from public repositories.
A labeled dataset of cloud types using data from GOES-16 and CloudSat
Jure, Paula V. Romero, Masuelli, Sergio, Cabral, Juan Bautista
In this paper we present the development of a dataset consisting of 91 Multi-band Cloud and Moisture Product Full-Disk (MCMIPF) from the Advanced Baseline Imager (ABI) on board GOES-16 geostationary satellite with 91 temporally and spatially corresponding CLDCLASS products from the CloudSat polar satellite. The products are diurnal, corresponding to the months of January and February 2019 and were chosen such that the products from both satellites can be co-located over South America. The CLDCLASS product provides the cloud type observed for each of the orbit's steps and the GOES-16 multiband images contain pixels that can be co-located with these data. We develop an algorithm that returns a product in the form of a table that provides pixels from multiband images labelled with the type of cloud observed in them. These labelled data conformed in this particular structure are very useful to perform supervised learning. This was corroborated by training a simple linear artificial neural network based on the work of Gorooh et al. (2020), which gave good results, especially for the classification of deep convective clouds.
Which Argumentative Aspects of Hate Speech in Social Media can be reliably identified?
Furman, Damián, Torres, Pablo, Rodríguez, José A., Letzen, Diego, Martínez, Vanina, Alemany, Laura Alonso
With the increasing diversity of use cases of large language models, a more informative treatment of texts seems necessary. An argumentative analysis could foster a more reasoned usage of chatbots, text completion mechanisms or other applications. However, it is unclear which aspects of argumentation can be reliably identified and integrated in language models. In this paper, we present an empirical assessment of the reliability with which different argumentative aspects can be automatically identified in hate speech in social media. We have enriched the Hateval corpus (Basile et al. 2019) with a manual annotation of some argumentative components, adapted from Wagemans (2016)'s Periodic Table of Arguments. We show that some components can be identified with reasonable reliability. For those that present a high error ratio, we analyze the patterns of disagreement between expert annotators and errors in automatic procedures, and we propose adaptations of those categories that can be more reliably reproduced.
Extended method for Statistical Signal Characterization using moments and cumulants: Application to recognition of pattern alterations in pulse-like waveforms employing Artificial Neural Networks
Bustos, G. H., Segnorile, H. H.
We propose a statistical procedure to characterize and extract features from a waveform that can be applied as a pre-processing signal stage in a pattern recognition task using Artificial Neural Networks. Such a procedure is based on measuring a 30-parameters set of moments and cumulants from the waveform, its derivative, and its integral. The technique is presented as an extension of the Statistical Signal Characterization method existing in the literature. As a testing methodology, we used the procedure to distinguish a pulse-like signal from different versions of itself with frequency spectrum alterations or deformations. The recognition task was performed by single feed-forward back-propagation networks trained for the case Sinc-, Gaussian-, and Chirp-pulse waveform. Because of the success obtained in these examples, we can conclude that the proposed extended statistical signal characterization method is an effective tool for pattern-recognition applications. In particular, we can use it as a fast pre-processing stage in embedded systems with limited memory or computational capability.